Overview

Dataset statistics

Number of variables: 13
Number of observations: 2970
Missing cells: 0 (0.0%)
Duplicate rows: 0 (0.0%)
Total size in memory: 301.8 KiB
Average record size in memory: 104.0 B
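
The memory figures above are consistent with every column holding 8-byte numeric values plus a small fixed index overhead. The sketch below checks that arithmetic. The 8-byte item size and the 128-byte index overhead are assumptions (they match common pandas defaults for int64/float64 columns and a RangeIndex); the report itself does not state them.

```python
# Assumptions, not stated in the report: 8 bytes per value, 128 bytes of index overhead.
N_ROWS = 2970
N_COLS = 13
BYTES_PER_VALUE = 8
INDEX_OVERHEAD = 128


def to_kib(n_bytes):
    """Convert a byte count to kibibytes."""
    return n_bytes / 1024


total_bytes = N_ROWS * N_COLS * BYTES_PER_VALUE + INDEX_OVERHEAD
column_bytes = N_ROWS * BYTES_PER_VALUE + INDEX_OVERHEAD

total_kib = round(to_kib(total_bytes), 1)      # compare with "Total size in memory"
record_bytes = round(total_bytes / N_ROWS, 1)  # compare with "Average record size in memory"
column_kib = round(to_kib(column_bytes), 1)    # compare with each variable's "Memory size"

print(total_kib, record_bytes, column_kib)
```

Under these assumptions the three results reproduce the reported 301.8 KiB, 104.0 B and 23.3 KiB.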

Variable types

Numeric: 13

Warnings

gross_revenue is highly correlated with qtde_invoices and 1 other field [High correlation]
qtde_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 1 other field [High correlation]
qtde_products is highly correlated with qtde_invoices [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
qtde_returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
gross_revenue is highly correlated with qtde_invoices and 3 other fields [High correlation]
recency_days is highly correlated with qtde_invoices [High correlation]
qtde_invoices is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qtde_products and 1 other field [High correlation]
gross_revenue is highly correlated with qtde_invoices and 2 other fields [High correlation]
qtde_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with qtde_items [High correlation]
avg_unique_basket_size is highly correlated with qtde_products [High correlation]
qtde_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtde_returns is highly correlated with avg_ticket and 3 other fields [High correlation]
avg_ticket is highly correlated with qtde_returns and 3 other fields [High correlation]
gross_revenue is highly correlated with qtde_invoices and 5 other fields [High correlation]
qtde_products is highly correlated with qtde_invoices and 3 other fields [High correlation]
avg_basket_size is highly correlated with qtde_returns and 3 other fields [High correlation]
avg_unique_basket_size is highly correlated with qtde_products [High correlation]
qtde_items is highly correlated with qtde_invoices and 5 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 53.45322416) [Skewed]
frequency is highly skewed (γ1 = 24.88427542) [Skewed]
qtde_returns is highly skewed (γ1 = 51.80645659) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.69064048) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
avg_ticket has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
qtde_returns has 1481 (49.9%) zeros [Zeros]
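
A high-correlation warning of the kind listed above can be thought of as a pairwise correlation exceeding a threshold. The sketch below illustrates the idea with a hand-written Pearson coefficient on toy data. The 0.9 threshold, the toy column names and the function names are assumptions for illustration only; the report does not state which measures or cut-offs produced its warnings.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged


# Toy data (hypothetical, not taken from the profiled dataset).
toy = {
    "invoices": [1, 2, 3, 4, 5],
    "revenue": [10, 21, 29, 42, 50],
    "recency": [30, 5, 60, 12, 45],
}
flagged = high_correlation_pairs(toy)
print(flagged)
```

Only the invoices/revenue pair is flagged here; the other two pairs have weak correlations.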

Reproduction

Analysis started: 2021-06-06 18:35:12.343806
Analysis finished: 2021-06-06 18:35:32.435836
Duration: 20.09 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
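
The reported duration follows directly from the two timestamps, and a report of this kind is typically regenerated with a few lines of pandas-profiling. The sketch below shows both. The function name `build_report`, the output file name and the title are hypothetical, and the original run may have used different options (see config.json).

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2021-06-06 18:35:12.343806")
finished = datetime.fromisoformat("2021-06-06 18:35:32.435836")
duration_seconds = round((finished - started).total_seconds(), 2)
print(duration_seconds)


def build_report(df, output_path="profile_report.html"):
    """Profile a pandas DataFrame and write an HTML report to output_path."""
    # Imported lazily so the duration check above runs without pandas-profiling.
    from pandas_profiling import ProfileReport

    profile = ProfileReport(df, title="Customer feature profile")  # hypothetical title
    profile.to_file(output_path)
    return output_path
```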

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2970 (100.0%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 2317.86532
Minimum: 0
Maximum: 5716
Zeros: 1 (< 0.1%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 185.45
Q1: 929.25
Median: 2120.5
Q3: 3537.75
95th percentile: 5036.1
Maximum: 5716
Range: 5716
Interquartile range (IQR): 2608.5

Descriptive statistics

Standard deviation: 1555.136615
Coefficient of variation (CV): 0.6709348474
Kurtosis: -1.010761417
Mean: 2317.86532
Median absolute deviation (MAD): 1271
Skewness: 0.3420323534
Sum: 6884060
Variance: 2418449.89
Monotonicity: Strictly increasing

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
0                     1       < 0.1%
635                   1       < 0.1%
625                   1       < 0.1%
4668                  1       < 0.1%
627                   1       < 0.1%
2676                  1       < 0.1%
629                   1       < 0.1%
2678                  1       < 0.1%
4727                  1       < 0.1%
2680                  1       < 0.1%
Other values (2960)   2960    99.7%

Smallest values

Value                 Count   Frequency (%)
0                     1       < 0.1%
1                     1       < 0.1%
2                     1       < 0.1%
3                     1       < 0.1%
4                     1       < 0.1%
5                     1       < 0.1%
6                     1       < 0.1%
7                     1       < 0.1%
8                     1       < 0.1%
9                     1       < 0.1%

Largest values

Value                 Count   Frequency (%)
5716                  1       < 0.1%
5697                  1       < 0.1%
5687                  1       < 0.1%
5681                  1       < 0.1%
5660                  1       < 0.1%
5656                  1       < 0.1%
5650                  1       < 0.1%
5639                  1       < 0.1%
5638                  1       < 0.1%
5628                  1       < 0.1%
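
Several of the df_index statistics are derived from one another, which gives a quick consistency check on the report. The sketch below recomputes range, IQR, CV and variance from the other reported values. It assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the report does not spell these out.

```python
# Values copied from the df_index block above.
minimum, maximum = 0, 5716
q1, q3 = 929.25, 3537.75
mean = 2317.86532
std = 1555.136615
reported_cv = 0.6709348474
reported_variance = 2418449.89

value_range = maximum - minimum  # expected to match "Range"
iqr = q3 - q1                    # expected to match "Interquartile range (IQR)"
cv = std / mean                  # expected to match "Coefficient of variation (CV)"
variance = std ** 2              # expected to match "Variance" up to rounding

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

The recomputed values agree with the reported Range (5716), IQR (2608.5), CV and Variance to within the rounding of the printed figures.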

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2970 (100.0%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 15270.71818
Minimum: 12347
Maximum: 18287
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 12347
5th percentile: 12619.45
Q1: 13799.75
Median: 15220.5
Q3: 16767.75
95th percentile: 17964.55
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2968

Descriptive statistics

Standard deviation: 1718.703373
Coefficient of variation (CV): 0.112548955
Kurtosis: -1.205496241
Mean: 15270.71818
Median absolute deviation (MAD): 1487
Skewness: 0.03170847426
Sum: 45354033
Variance: 2953941.285
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
16384                 1       < 0.1%
13363                 1       < 0.1%
12931                 1       < 0.1%
12933                 1       < 0.1%
12935                 1       < 0.1%
14984                 1       < 0.1%
17033                 1       < 0.1%
13704                 1       < 0.1%
12939                 1       < 0.1%
17037                 1       < 0.1%
Other values (2960)   2960    99.7%

Smallest values

Value                 Count   Frequency (%)
12347                 1       < 0.1%
12348                 1       < 0.1%
12352                 1       < 0.1%
12356                 1       < 0.1%
12358                 1       < 0.1%
12359                 1       < 0.1%
12360                 1       < 0.1%
12362                 1       < 0.1%
12364                 1       < 0.1%
12370                 1       < 0.1%

Largest values

Value                 Count   Frequency (%)
18287                 1       < 0.1%
18283                 1       < 0.1%
18282                 1       < 0.1%
18277                 1       < 0.1%
18276                 1       < 0.1%
18274                 1       < 0.1%
18273                 1       < 0.1%
18272                 1       < 0.1%
18270                 1       < 0.1%
18269                 1       < 0.1%
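
The Monotonicity field reported for df_index ("Strictly increasing") and customer_id ("Not monotonic") describes the order of values as they appear in the column. A minimal sketch of such a classification follows. The function name and the three labels not seen in this report ("Strictly decreasing", "Increasing", "Decreasing") are assumptions.

```python
def monotonicity(values):
    """Classify a sequence by comparing each element with its successor."""
    pairs = list(zip(values, values[1:]))
    # Note: with fewer than two values there are no pairs, so the first test passes.
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


print(monotonicity([0, 1, 2, 3]))            # ordered like an index column
print(monotonicity([12347, 18287, 12348]))   # ids in arbitrary order
```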

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2964 (99.8%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 2748.69071
Minimum: 6.2
Maximum: 279138.02
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 6.2
5th percentile: 229.8075
Q1: 571.02
Median: 1088.53
Q3: 2307.675
95th percentile: 7217.565
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1736.655

Descriptive statistics

Standard deviation: 10578.74876
Coefficient of variation (CV): 3.848650093
Kurtosis: 354.076566
Mean: 2748.69071
Median absolute deviation (MAD): 674.05
Skewness: 16.78066089
Sum: 8163611.41
Variance: 111909925.3
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
745.06                2       0.1%
331                   2       0.1%
379.65                2       0.1%
734.94                2       0.1%
533.33                2       0.1%
731.9                 2       0.1%
1414.43               1       < 0.1%
923.41                1       < 0.1%
471.51                1       < 0.1%
13375.87              1       < 0.1%
Other values (2954)   2954    99.5%

Smallest values

Value                 Count   Frequency (%)
6.2                   1       < 0.1%
13.3                  1       < 0.1%
15                    1       < 0.1%
36.56                 1       < 0.1%
45                    1       < 0.1%
52                    1       < 0.1%
52.2                  1       < 0.1%
52.2                  1       < 0.1%
62.43                 1       < 0.1%
68.84                 1       < 0.1%

Largest values

Value                 Count   Frequency (%)
279138.02             1       < 0.1%
259657.3              1       < 0.1%
194550.79             1       < 0.1%
168472.5              1       < 0.1%
140438.72             1       < 0.1%
124564.53             1       < 0.1%
117375.63             1       < 0.1%
91062.38              1       < 0.1%
72882.09              1       < 0.1%
66653.56              1       < 0.1%
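
For gross_revenue the standard deviation (10578.75) is far larger than the median absolute deviation (674.05), which is what a handful of very large values, like those in the list above, tends to produce. The sketch below shows how the two measures react to a single outlier on toy data. It assumes MAD means the unscaled median of absolute deviations from the median; the report does not state its exact definition.

```python
import statistics


def median_absolute_deviation(values):
    """Unscaled MAD: the median of |x - median(values)|."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


# Toy data (hypothetical): the second list replaces one value with an outlier.
regular = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]

mad_regular = median_absolute_deviation(regular)
mad_outlier = median_absolute_deviation(with_outlier)
std_regular = statistics.stdev(regular)
std_outlier = statistics.stdev(with_outlier)

print(mad_regular, mad_outlier)
print(round(std_regular, 2), round(std_outlier, 2))
```

The MAD is unchanged by the outlier, while the standard deviation grows by more than an order of magnitude.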

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272 (9.2%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 64.31447811
Minimum: 0
Maximum: 373
Zeros: 34 (1.1%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 2
Q1: 11
Median: 31
Q3: 81
95th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.75581705
Coefficient of variation (CV): 1.208993983
Kurtosis: 2.774121666
Mean: 64.31447811
Median absolute deviation (MAD): 26
Skewness: 1.797141709
Sum: 191014
Variance: 6045.967086
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
1                     99      3.3%
4                     87      2.9%
2                     85      2.9%
3                     85      2.9%
8                     76      2.6%
10                    67      2.3%
7                     66      2.2%
9                     66      2.2%
17                    64      2.2%
22                    55      1.9%
Other values (262)    2220    74.7%

Smallest values

Value                 Count   Frequency (%)
0                     34      1.1%
1                     99      3.3%
2                     85      2.9%
3                     85      2.9%
4                     87      2.9%
5                     43      1.4%
7                     66      2.2%
8                     76      2.6%
9                     66      2.2%
10                    67      2.3%

Largest values

Value                 Count   Frequency (%)
373                   2       0.1%
372                   4       0.1%
371                   1       < 0.1%
368                   1       < 0.1%
366                   4       0.1%
365                   2       0.1%
364                   1       < 0.1%
360                   1       < 0.1%
359                   1       < 0.1%
358                   4       0.1%
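
The Frequency (%) column in these tables is each count divided by the 2970 observations, shown to one decimal place. The sketch below reproduces a few of the printed percentages. The "< 0.1%" rule for non-zero counts that would otherwise round to 0.0% is inferred from the tables, and the function name is hypothetical.

```python
N_OBSERVATIONS = 2970  # "Number of observations" from the overview


def format_share(count, total=N_OBSERVATIONS):
    """Format count/total as a percentage with one decimal place."""
    text = f"{100 * count / total:.1f}"
    # Inferred display rule: a non-zero count never shows as 0.0%.
    if count > 0 and text == "0.0":
        return "< 0.1%"
    return text + "%"


print(format_share(34))    # zeros in recency_days
print(format_share(1481))  # zeros in qtde_returns
print(format_share(1))     # a value seen once
```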

qtde_invoices
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 56 (1.9%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 5.721885522
Minimum: 1
Maximum: 206
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 6
95th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.855303226
Coefficient of variation (CV): 1.547619782
Kurtosis: 190.8826625
Mean: 5.721885522
Median absolute deviation (MAD): 2
Skewness: 10.76805433
Sum: 16994
Variance: 78.41639523
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
2                     786     26.5%
3                     499     16.8%
4                     393     13.2%
5                     237     8.0%
1                     190     6.4%
6                     173     5.8%
7                     138     4.6%
8                     98      3.3%
9                     69      2.3%
10                    55      1.9%
Other values (46)     332     11.2%

Smallest values

Value                 Count   Frequency (%)
1                     190     6.4%
2                     786     26.5%
3                     499     16.8%
4                     393     13.2%
5                     237     8.0%
6                     173     5.8%
7                     138     4.6%
8                     98      3.3%
9                     69      2.3%
10                    55      1.9%

Largest values

Value                 Count   Frequency (%)
206                   1       < 0.1%
199                   1       < 0.1%
124                   1       < 0.1%
97                    1       < 0.1%
91                    2       0.1%
86                    1       < 0.1%
72                    1       < 0.1%
62                    2       0.1%
60                    1       < 0.1%
57                    1       < 0.1%

qtde_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1665 (56.1%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 1606.085185
Minimum: 1
Maximum: 196844
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 101.45
Q1: 296.25
Median: 638
Q3: 1398.75
95th percentile: 4407.2
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1102.5

Descriptive statistics

Standard deviation: 5882.021385
Coefficient of variation (CV): 3.662334626
Kurtosis: 467.3049728
Mean: 1606.085185
Median absolute deviation (MAD): 419
Skewness: 17.88131633
Sum: 4770073
Variance: 34598175.58
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
310                   11      0.4%
88                    9       0.3%
150                   9       0.3%
246                   8       0.3%
84                    8       0.3%
260                   8       0.3%
288                   8       0.3%
272                   8       0.3%
200                   7       0.2%
134                   7       0.2%
Other values (1655)   2887    97.2%

Smallest values

Value                 Count   Frequency (%)
1                     1       < 0.1%
2                     2       0.1%
12                    2       0.1%
16                    1       < 0.1%
17                    1       < 0.1%
18                    1       < 0.1%
19                    1       < 0.1%
20                    1       < 0.1%
23                    1       < 0.1%
25                    1       < 0.1%

Largest values

Value                 Count   Frequency (%)
196844                1       < 0.1%
80997                 1       < 0.1%
79963                 1       < 0.1%
77373                 1       < 0.1%
69993                 1       < 0.1%
64549                 1       < 0.1%
64124                 1       < 0.1%
62812                 1       < 0.1%
58243                 1       < 0.1%
57785                 1       < 0.1%

qtde_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 469 (15.8%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 122.683165
Minimum: 1
Maximum: 7837
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 9
Q1: 29
Median: 67
Q3: 135
95th percentile: 382
Maximum: 7837
Range: 7836
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.7992435
Coefficient of variation (CV): 2.199154575
Kurtosis: 354.9485422
Mean: 122.683165
Median absolute deviation (MAD): 44
Skewness: 15.70855259
Sum: 364369
Variance: 72791.63181
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
28                    45      1.5%
20                    38      1.3%
35                    35      1.2%
19                    33      1.1%
15                    33      1.1%
29                    33      1.1%
11                    32      1.1%
26                    31      1.0%
27                    30      1.0%
16                    29      1.0%
Other values (459)    2631    88.6%

Smallest values

Value                 Count   Frequency (%)
1                     6       0.2%
2                     14      0.5%
3                     16      0.5%
4                     17      0.6%
5                     26      0.9%
6                     29      1.0%
7                     18      0.6%
8                     19      0.6%
9                     27      0.9%
10                    27      0.9%

Largest values

Value                 Count   Frequency (%)
7837                  1       < 0.1%
5670                  1       < 0.1%
5095                  1       < 0.1%
4577                  1       < 0.1%
2698                  1       < 0.1%
2379                  1       < 0.1%
2060                  1       < 0.1%
1818                  1       < 0.1%
1673                  1       < 0.1%
1636                  1       < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
UNIQUE

Distinct: 2970 (100.0%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 51.8894298
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 2.150588235
5th percentile: 4.917434212
Q1: 13.11982258
Median: 17.97733464
Q3: 24.98612169
95th percentile: 90.49558333
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 11.86629911

Descriptive statistics

Standard deviation: 1036.759857
Coefficient of variation (CV): 19.98017439
Kurtosis: 2891.680828
Mean: 51.8894298
Median absolute deviation (MAD): 5.990148388
Skewness: 53.45322416
Sum: 154111.6065
Variance: 1074871
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
17.49275862           1       < 0.1%
33.53571429           1       < 0.1%
17.62896175           1       < 0.1%
28.89968794           1       < 0.1%
46.07413043           1       < 0.1%
25.77538462           1       < 0.1%
8.745172414           1       < 0.1%
18.15061538           1       < 0.1%
17.94344444           1       < 0.1%
15.9845               1       < 0.1%
Other values (2960)   2960    99.7%

Smallest values

Value                 Count   Frequency (%)
2.150588235           1       < 0.1%
2.4325                1       < 0.1%
2.462371134           1       < 0.1%
2.511241379           1       < 0.1%
2.515333333           1       < 0.1%
2.65                  1       < 0.1%
2.656931818           1       < 0.1%
2.707598253           1       < 0.1%
2.760621572           1       < 0.1%
2.770464191           1       < 0.1%

Largest values

Value                 Count   Frequency (%)
56157.5               1       < 0.1%
4453.43               1       < 0.1%
3202.92               1       < 0.1%
1687.2                1       < 0.1%
952.9875              1       < 0.1%
872.13                1       < 0.1%
841.0214493           1       < 0.1%
651.1683333           1       < 0.1%
640                   1       < 0.1%
624.4                 1       < 0.1%
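
The skewness of 53.45 reported for avg_ticket is driven by a few extreme values such as the 56157.5 maximum listed above. The sketch below computes a sample skewness coefficient (γ1) on toy data to show the effect of one such value. It assumes the bias-adjusted Fisher-Pearson estimator, the one pandas' `Series.skew` uses by default; the report does not say which estimator it applied.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (assumed estimator, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    m3 = sum((v - mean) ** 3 for v in values)  # sum of cubed deviations
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    return (n * math.sqrt(n - 1) / (n - 2)) * m3 / m2 ** 1.5


# Toy data (hypothetical): symmetric values versus one large outlier.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 2, 3, 4, 500]

print(sample_skewness(symmetric))
print(sample_skewness(long_tail))
```

Symmetric data gives a skewness of zero, while a single large value pushes the coefficient strongly positive.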

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1258 (42.4%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 67.33840526
Minimum: 1
Maximum: 366
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 8
Q1: 25.94642857
Median: 48.26785714
Q3: 85.33333333
95th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.38690476

Descriptive statistics

Standard deviation: 63.53609292
Coefficient of variation (CV): 0.9435342681
Kurtosis: 4.890127492
Mean: 67.33840526
Median absolute deviation (MAD): 26.26785714
Skewness: 2.063409073
Sum: 199995.0636
Variance: 4036.835104
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
14                    25      0.8%
4                     22      0.7%
70                    21      0.7%
7                     20      0.7%
35                    19      0.6%
49                    18      0.6%
46                    17      0.6%
11                    17      0.6%
21                    17      0.6%
1                     16      0.5%
Other values (1248)   2778    93.5%

Smallest values

Value                 Count   Frequency (%)
1                     16      0.5%
1.5                   1       < 0.1%
2                     13      0.4%
2.5                   1       < 0.1%
2.601398601           1       < 0.1%
3                     15      0.5%
3.321428571           1       < 0.1%
3.330357143           1       < 0.1%
3.5                   2       0.1%
4                     22      0.7%

Largest values

Value                 Count   Frequency (%)
366                   1       < 0.1%
365                   1       < 0.1%
363                   1       < 0.1%
362                   1       < 0.1%
357                   2       0.1%
356                   1       < 0.1%
355                   2       0.1%
352                   1       < 0.1%
351                   2       0.1%
350                   3       0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1225 (41.2%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 0.1137645194
Minimum: 0.005449591281
Maximum: 17
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0.005449591281
5th percentile: 0.008894823607
Q1: 0.01633986928
Median: 0.02589835169
Q3: 0.04938789715
95th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.03304802787

Descriptive statistics

Standard deviation: 0.4080910039
Coefficient of variation (CV): 3.587155346
Kurtosis: 989.6727708
Mean: 0.1137645194
Median absolute deviation (MAD): 0.0121968864
Skewness: 24.88427542
Sum: 337.8806226
Variance: 0.1665382675
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
1                     198     6.7%
0.02777777778         17      0.6%
0.0625                17      0.6%
0.02380952381         16      0.5%
0.03448275862         15      0.5%
0.08333333333         15      0.5%
0.09090909091         15      0.5%
0.02941176471         14      0.5%
0.03571428571         13      0.4%
0.02564102564         13      0.4%
Other values (1215)   2637    88.8%

Smallest values

Value                 Count   Frequency (%)
0.005449591281        1       < 0.1%
0.005464480874        1       < 0.1%
0.005479452055        1       < 0.1%
0.005494505495        1       < 0.1%
0.005586592179        2       0.1%
0.005602240896        1       < 0.1%
0.005617977528        2       0.1%
0.00566572238         1       < 0.1%
0.005681818182        2       0.1%
0.005698005698        3       0.1%

Largest values

Value                 Count   Frequency (%)
17                    1       < 0.1%
3                     1       < 0.1%
2                     6       0.2%
1.142857143           1       < 0.1%
1                     198     6.7%
0.75                  1       < 0.1%
0.6666666667          3       0.1%
0.550802139           1       < 0.1%
0.5335120643          1       < 0.1%
0.5                   3       0.1%

qtde_returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 214 (7.2%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 62.13670034
Minimum: 0
Maximum: 80995
Zeros: 1481 (49.9%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 1
Q3: 9
95th percentile: 100.55
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1512.241801
Coefficient of variation (CV): 24.33733676
Kurtosis: 2766.459377
Mean: 62.13670034
Median absolute deviation (MAD): 1
Skewness: 51.80645659
Sum: 184546
Variance: 2286875.265
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
0                     1481    49.9%
1                     164     5.5%
2                     149     5.0%
3                     105     3.5%
4                     89      3.0%
6                     78      2.6%
5                     61      2.1%
12                    51      1.7%
7                     43      1.4%
8                     43      1.4%
Other values (204)    706     23.8%

Smallest values

Value                 Count   Frequency (%)
0                     1481    49.9%
1                     164     5.5%
2                     149     5.0%
3                     105     3.5%
4                     89      3.0%
5                     61      2.1%
6                     78      2.6%
7                     43      1.4%
8                     43      1.4%
9                     41      1.4%

Largest values

Value                 Count   Frequency (%)
80995                 1       < 0.1%
9014                  1       < 0.1%
8004                  1       < 0.1%
4427                  1       < 0.1%
3768                  1       < 0.1%
3332                  1       < 0.1%
2878                  1       < 0.1%
2022                  1       < 0.1%
2012                  1       < 0.1%
1776                  1       < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1974 (66.5%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 249.3205793
Minimum: 1
Maximum: 40498.5
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 44
Q1: 103.2625
Median: 172
Q3: 281.4583333
95th percentile: 599.46
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178.1958333

Descriptive statistics

Standard deviation: 791.3706789
Coefficient of variation (CV): 3.174108937
Kurtosis: 2256.993743
Mean: 249.3205793
Median absolute deviation (MAD): 82.625
Skewness: 44.69064048
Sum: 740482.1206
Variance: 626267.5514
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
100                   11      0.4%
114                   10      0.3%
82                    9       0.3%
73                    9       0.3%
86                    9       0.3%
136                   8       0.3%
60                    8       0.3%
88                    8       0.3%
75                    8       0.3%
288                   7       0.2%
Other values (1964)   2883    97.1%

Smallest values

Value                 Count   Frequency (%)
1                     2       0.1%
2                     1       < 0.1%
3.333333333           1       < 0.1%
5.333333333           1       < 0.1%
5.666666667           1       < 0.1%
6.142857143           1       < 0.1%
7.5                   1       < 0.1%
9                     1       < 0.1%
9.5                   1       < 0.1%
11                    1       < 0.1%

Largest values

Value                 Count   Frequency (%)
40498.5               1       < 0.1%
6009.333333           1       < 0.1%
4282                  1       < 0.1%
3906                  1       < 0.1%
3868.65               1       < 0.1%
2880                  1       < 0.1%
2801                  1       < 0.1%
2733.944444           1       < 0.1%
2518.769231           1       < 0.1%
2160.333333           1       < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1010 (34.0%)
Missing: 0 (0.0%)
Infinite: 0 (0.0%)
Mean: 22.15401143
Minimum: 1
Maximum: 299.7058824
Zeros: 0 (0.0%)
Negative: 0 (0.0%)
Memory size: 23.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 3.346969697
Q1: 10
Median: 17.2
Q3: 27.75
95th percentile: 56.9325
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 19.50983265
Coefficient of variation (CV): 0.8806455983
Kurtosis: 27.70524325
Mean: 22.15401143
Median absolute deviation (MAD): 8.2
Skewness: 3.498956413
Sum: 65797.41393
Variance: 380.6335699
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
Most frequent values

Value                 Count   Frequency (%)
13                    53      1.8%
14                    40      1.3%
11                    38      1.3%
9                     33      1.1%
20                    33      1.1%
1                     32      1.1%
18                    31      1.0%
10                    30      1.0%
16                    29      1.0%
17                    28      0.9%
Other values (1000)   2623    88.3%

Smallest values

Value                 Count   Frequency (%)
1                     32      1.1%
1.2                   1       < 0.1%
1.25                  1       < 0.1%
1.333333333           2       0.1%
1.5                   8       0.3%
1.568181818           1       < 0.1%
1.571428571           1       < 0.1%
1.666666667           4       0.1%
1.833333333           1       < 0.1%
2                     24      0.8%

Largest values

Value                 Count   Frequency (%)
299.7058824           1       < 0.1%
259                   1       < 0.1%
203.5                 1       < 0.1%
148                   1       < 0.1%
145                   1       < 0.1%
136.125               1       < 0.1%
135.5                 1       < 0.1%
127                   1       < 0.1%
122                   1       < 0.1%
118                   1       < 0.1%

Interactions

[Figures: pairwise interaction plots between the variables]
2021-06-06T15:35:29.864172image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:29.967312image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.070895image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.176764image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.285691image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.397458image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.512886image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.621643image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.719655image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.811944image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.903680image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:30.996287image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.092244image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.183252image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.288704image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.394562image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.499439image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.601146image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.705540image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.817255image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
2021-06-06T15:35:31.928208image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exactly linear relationship the magnitude of the slope does not affect r; only its sign does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
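
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator itself runs; the function name `pearson_r` and the sample values are assumptions made up for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations. The 1/n (or 1/(n-1))
    # factors of covariance and variance cancel, so they are omitted.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: two exactly linear relationships with different slopes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_steep = [10.0 * v + 3.0 for v in x]  # steep positive slope, shifted
y_down = [-0.5 * v for v in x]         # shallow negative slope

print(pearson_r(x, y_steep))
print(pearson_r(x, y_down))
```

Both relationships are perfectly linear, so the two results differ only in sign, which illustrates the invariance to location and scale.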

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
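
As a rough sketch of this recipe, the snippet below ranks each variable and then applies the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are assumptions for illustration; giving tied values the mean of their positions is a common convention, not something this report specifies.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(spearman_rho(x, y))  # monotonic relationship, so rho reaches 1
print(pearson_r(x, y))     # below 1 because the relationship is not linear
```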

Kendall's τ

Similarly to Spearman's rank correlation coefficient, Kendall's rank correlation coefficient (τ) measures the ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
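
The pair-counting definition above can be sketched directly. This is a minimal illustration of that formula (often called tau-a) under the assumption that ties simply count as neither concordant nor discordant; library implementations often use a tie-adjusted variant (tau-b), so their results can differ when ties are present. The function name and data are made up for the example.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, as defined in the text."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 is a tie and adds to neither count.
    return (concordant - discordant) / len(pairs)


# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

print(kendall_tau(x, y))  # 5 concordant, 1 discordant, 6 pairs -> 4/6
```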

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
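
The numbers behind a by-column nullity view are per-column counts of missing values. The sketch below shows one way to compute them in plain Python, assuming missing entries are represented as `None`; the function name and the two sample rows are hypothetical and are not taken from this dataset, which reports zero missing cells.

```python
def nullity_by_column(rows):
    """Count missing (None) entries per column across a list of row dicts."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (1 if value is None else 0)
    return counts


# Hypothetical rows: one missing gross_revenue, no missing recency_days.
rows = [
    {"gross_revenue": 100.0, "recency_days": 5.0},
    {"gross_revenue": None, "recency_days": 12.0},
]

print(nullity_by_column(rows))
```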

Sample

First rows

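| | df_index | customer_id | gross_revenue | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 17850 | 5391.21 | 372.0 | 34.0 | 1733.0 | 297.0 | 18.152222 | 35.500000 | 17.000000 | 40.0 | 50.970588 | 8.735294 |
| 1 | 1 | 13047 | 3232.59 | 56.0 | 9.0 | 1390.0 | 171.0 | 18.904035 | 27.250000 | 0.028302 | 35.0 | 154.444444 | 19.000000 |
| 2 | 2 | 12583 | 6705.38 | 2.0 | 15.0 | 5028.0 | 232.0 | 28.902500 | 23.187500 | 0.040323 | 50.0 | 335.200000 | 15.466667 |
| 3 | 3 | 13748 | 948.25 | 95.0 | 5.0 | 439.0 | 28.0 | 33.866071 | 92.666667 | 0.017921 | 0.0 | 87.800000 | 5.600000 |
| 4 | 4 | 15100 | 876.00 | 333.0 | 3.0 | 80.0 | 3.0 | 292.000000 | 8.600000 | 0.073171 | 22.0 | 26.666667 | 1.000000 |
| 5 | 5 | 15291 | 4623.30 | 25.0 | 14.0 | 2102.0 | 102.0 | 45.326471 | 23.200000 | 0.040115 | 29.0 | 150.142857 | 7.285714 |
| 6 | 6 | 14688 | 5630.87 | 7.0 | 21.0 | 3621.0 | 327.0 | 17.219786 | 18.300000 | 0.057221 | 399.0 | 172.428571 | 15.571429 |
| 7 | 7 | 17809 | 5411.91 | 16.0 | 12.0 | 2057.0 | 61.0 | 88.719836 | 35.700000 | 0.033520 | 41.0 | 171.416667 | 5.083333 |
| 8 | 8 | 15311 | 60767.90 | 0.0 | 91.0 | 38194.0 | 2379.0 | 25.543464 | 4.144444 | 0.243316 | 474.0 | 419.714286 | 26.142857 |
| 9 | 9 | 16098 | 2005.63 | 87.0 | 7.0 | 613.0 | 67.0 | 29.934776 | 47.666667 | 0.024390 | 0.0 | 87.571429 | 9.571429 |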

Last rows

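| | df_index | customer_id | gross_revenue | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2960 | 5628 | 17727 | 1060.25 | 15.0 | 1.0 | 645.0 | 66.0 | 16.064394 | 6.0 | 1.000000 | 6.0 | 645.000000 | 66.0 |
| 2961 | 5638 | 17232 | 421.52 | 2.0 | 2.0 | 203.0 | 36.0 | 11.708889 | 12.0 | 0.153846 | 0.0 | 101.500000 | 18.0 |
| 2962 | 5639 | 17468 | 137.00 | 10.0 | 2.0 | 116.0 | 5.0 | 27.400000 | 4.0 | 0.400000 | 0.0 | 58.000000 | 2.5 |
| 2963 | 5650 | 13596 | 697.04 | 5.0 | 2.0 | 406.0 | 166.0 | 4.199036 | 7.0 | 0.250000 | 0.0 | 203.000000 | 83.0 |
| 2964 | 5656 | 14893 | 1237.85 | 9.0 | 2.0 | 799.0 | 73.0 | 16.956849 | 2.0 | 0.666667 | 0.0 | 399.500000 | 36.5 |
| 2965 | 5660 | 12479 | 473.20 | 11.0 | 1.0 | 382.0 | 30.0 | 15.773333 | 4.0 | 1.000000 | 34.0 | 382.000000 | 30.0 |
| 2966 | 5681 | 14126 | 706.13 | 7.0 | 3.0 | 508.0 | 15.0 | 47.075333 | 3.0 | 0.750000 | 50.0 | 169.333333 | 5.0 |
| 2967 | 5687 | 13521 | 1092.39 | 1.0 | 3.0 | 733.0 | 435.0 | 2.511241 | 4.5 | 0.300000 | 0.0 | 244.333333 | 145.0 |
| 2968 | 5697 | 15060 | 301.84 | 8.0 | 4.0 | 262.0 | 120.0 | 2.515333 | 1.0 | 2.000000 | 0.0 | 65.500000 | 30.0 |
| 2969 | 5716 | 12558 | 269.96 | 7.0 | 1.0 | 196.0 | 11.0 | 24.541818 | 6.0 | 1.000000 | 196.0 | 196.000000 | 11.0 |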